Dedication

This article is written for Ingram Olkin on the occasion of his 80th 
birthday. Ingram has provided inspiration for me over the last 40 years 
and continues to inspire. I am indebted to him for his encouragement 
and support throughout my career. I am contributing this humbly in 
the sure knowledge that he could have written it better than I. 
Abstract. The appearance of Marshall and Olkin's 1979 book on inequalities
with special emphasis on majorization generated a surge of
interest in potential applications of majorization and Schur convexity 
in a broad spectrum of fields. After 25 years this continues to be the 
case. The present article presents a sampling of the diverse areas in 
which majorization has been found to be useful in the past 25 years. 

Key words and phrases: Inequalities, Schur convex, covering, waiting 
time, paired comparisons, phase type, catchability, disease transmis- 
sion, apportionment, statistical mechanics, random graph. 



1. INTRODUCTION 

Prior to the appearance of the celebrated volume 
Inequalities: Theory of Majorization and Its Appli- 
cations (Marshall and Olkin, 1979) many researchers 
were unaware of the rich body of literature related to 
majorization that was scattered in journals in a wide 
variety of fields. Indeed, many majorization concepts 
had been reinvented and often rechristened in dif- 
ferent research areas (e.g., as Lorenz or dominance 
ordering in economics), complicating the difficulties 
for the researcher when trying to relate current re- 
search to the extant corpus. Of course, the appear- 
ance of the Marshall and Olkin volume changed all 
that. They heroically had sifted the literature and
endeavored to arrange ideas in order, often provid- 
ing references to multiple proofs and multiple view- 
points on key results, with reference to a variety of 
applied fields. Many of the key ideas relating to ma- 
jorization were already discussed in the (also justly 
celebrated) volume entitled Inequalities by Hardy, 
Littlewood and Polya (1934). Indeed, this slim vol- 
ume still merits occasional revisits since there re- 
main in it many "seedlings for further research" (to 
borrow Kingman's apt descriptive phrase). Of course
the Hardy, Littlewood and Polya volume, though 
slim and printed on small pages, was all meat and 
no gravy: more like a series of insightful telegrams. 
Only a relatively small number of researchers were 
inspired by it to work on questions relating to ma- 
jorization. 

But things were different after 1979. Marshall and 
Olkin sold the product much more effectively. When- 
ever a situation was encountered in which a solution 
or an extreme case involved a discrete uniform dis- 
tribution, the possibility of a majorization proof was 
now apparent if not to all, certainly to many, and 
certainly in many different areas of research. Moreover,
if a uniform allocation or distribution was in
a sense optimal, then the concept of majorization 
frequently could be used to order competing alloca- 
tions or distributions. 

Naturally extensions of the majorization concept 
were possible and indeed many have been fruitfully 
introduced. The focus of the present article is, how- 
ever, on classical majorization. The goal is to pro- 
vide a hint (via selected examples from the post- 
1979 literature) of the vast array of settings in which 
majorization provides a useful and interpretable or- 
dering. In no sense can such a survey be complete. I 
apologize, in advance, to researchers who, quite le- 
gitimately, can point to papers of their own which 
they feel would be even better illustrations of the 
theme: Majorization, here, there and everywhere. 
Nevertheless it is my hope that the examples se- 
lected will be found to be interesting, to be suffi- 
ciently diverse in order to illustrate the potential 
ubiquity of dispersion ordering (a.k.a. majorization) 
concepts and, perhaps, to inspire researchers to seek 
even more research niches in which majorization and 
Schur convexity will play a useful role. 

2. SOME NEEDED DEFINITIONS 

We will say that a vector $x \in \mathbb{R}^n$ majorizes another
vector $y \in \mathbb{R}^n$, and write $x \succ y$, if for each
$k = 1, 2, \ldots, n-1$ we have

$$\sum_{i=1}^{k} x_{i:n} \le \sum_{i=1}^{k} y_{i:n}$$

and

$$\sum_{i=1}^{n} x_{i:n} = \sum_{i=1}^{n} y_{i:n}.$$

In the above we denote the ordered coordinates of a
vector $x \in \mathbb{R}^n$ by $x_{1:n} \le x_{2:n} \le \cdots \le x_{n:n}$.

A function $g : \mathbb{R}^n \to \mathbb{R}$ is said to be Schur
convex if $x \succ y$ implies $g(x) \ge g(y)$. For additional
details and alternative characterizations of majorization
and Schur convexity, we naturally refer to Marshall
and Olkin (1979).

In short, the vector $x$ majorizes $y$ if the coordinates
of $x$ are more dispersed than are the coordinates
of $y$, subject to the constraint that the sum of
the coordinates of $x$ and of $y$ is the same.

A Schur convex function then is one that increases 
as dispersion increases (where the concept of disper- 
sion used is specifically linked to the majorization 
order).
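The definition above translates directly into a partial-sum check. The following is a minimal sketch, not code from the literature; the function name `majorizes` and the tolerance `tol` are assumptions made here for illustration.

```python
def majorizes(x, y, tol=1e-12):
    """Return True if x majorizes y in the sense defined above.

    Hypothetical helper: the name and the tolerance are illustrative choices.
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    xs, ys = sorted(x), sorted(y)      # increasing order: x_{1:n} <= ... <= x_{n:n}
    sx = sy = 0.0
    for k in range(len(xs) - 1):       # partial sums for k = 1, ..., n-1
        sx += xs[k]
        sy += ys[k]
        if sx > sy + tol:              # the k smallest x's must not exceed the k smallest y's
            return False
    return abs(sum(xs) - sum(ys)) <= tol   # the totals must agree


print(majorizes([3, 0, 0], [1, 1, 1]))   # the concentrated vector versus the uniform one
print(majorizes([1, 1, 1], [3, 0, 0]))
```

Sorting both vectors first makes the check insensitive to the order in which the coordinates are listed, which matches the fact that majorization depends only on the ordered coordinates.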



The extremal case under the majorization order
corresponds to the choice $x_i = (\sum_{j=1}^{n} x_j)/n$. In
particular then, a Schur convex function will take on
a value at least as large when there is some variability in $x$
as it does when there is no variability [i.e., when
$x_i = \bar{x} = (\sum_{j=1}^{n} x_j)/n$, $i = 1, 2, \ldots, n$].

Many examples of Schur convex functions can of 
course be found in the literature. Perhaps the sim- 
plest example is what is called a separable convex 
function. It is of the form

$$g(x) = \sum_{i=1}^{n} h(x_i),$$

where $h$ is a convex function.
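A quick numerical illustration of why separable convex functions behave this way: moving mass from a larger coordinate to a smaller one (a transfer toward equality) cannot increase $\sum_i h(x_i)$ when $h$ is convex. The sketch below is only an illustration; the helper names `separable` and `robin_hood` are assumptions introduced here.

```python
from math import exp


def separable(h):
    """Build g(x) = sum_i h(x_i) from a one-variable function h."""
    return lambda x: sum(h(xi) for xi in x)


def robin_hood(x, i, j, amount):
    """Move `amount` from the larger of x[i], x[j] to the smaller one.

    The amount may not exceed the gap between the two coordinates, so the
    result is never more dispersed than the input.
    """
    lo, hi = (i, j) if x[i] <= x[j] else (j, i)
    if not 0 <= amount <= x[hi] - x[lo]:
        raise ValueError("amount must lie between 0 and the gap")
    y = list(x)
    y[hi] -= amount
    y[lo] += amount
    return y


x = [5.0, 1.0, 0.0]
y = robin_hood(x, 0, 1, 1.0)           # y = [4.0, 2.0, 0.0], less dispersed than x
for h in (lambda t: t * t, exp):       # two convex choices of h
    g = separable(h)
    print(g(x) >= g(y))
```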

We now begin our tour of examples in the liter- 
ature in which majorization makes cameo and/or 
starring appearances. 

One can even consider a variation of the children's
game "Where's Waldo?". In that game a very complicated
picture is provided in which, hidden away, is
a picture of the hero Waldo. He is always there, but 
he is often hard to find. Similarly we can view vari- 
ous areas of statistical research and/or applications 
as being rather complicated scenes in which perhaps 
Waldo, a.k.a. majorization, may well be lurking. The 
search begins. 

3. COVERING A CIRCLE WITH RANDOMLY 
PLACED ARCS 

Suppose that $n$ arcs of lengths $\ell_1, \ell_2, \ldots, \ell_n$ are
placed independently and uniformly on the unit circle
(a circle with unit circumference). Let $P(\ell)$ denote
the probability that the unit circle is completely
covered by these arcs. The problem is only
interesting when the total length of the arcs, $L = \sum_{i=1}^{n} \ell_i$,
exceeds 1, the circumference of the circle.
We therefore assume that $L > 1$. In the special case
in which the arcs are of equal lengths (say $\ell = L/n$),
the required probability was provided by Stevens
(1939). Specifically we have

$$P(\ell, \ldots, \ell) = \sum_{k=0}^{n} (-1)^k \binom{n}{k} (1 - k\ell)_+^{n-1}, \tag{3.1}$$

where $(a)_+ = \max(a, 0)$.

At the other extreme, if one arc is of length $L$ and
the others of length 0, coverage is certain. It would
appear then that, in this situation, increasing the
variability among the $\ell_i$'s, subject to the sum being
equal to $L$, might well be associated with an
increase in the coverage probability. Proschan conjectured
that $P(\ell)$ is a Schur convex function. It is
indeed Schur convex but it is not that easy to verify.
Details were provided by Huffer and Shepp (1987).
Not surprisingly, the argument is based on studying
the effect on $P(\ell)$ of making a small change in
two unequal $\ell_i$'s (to make them more alike) holding
the other lengths fixed. Waldo is here, but he is not
easily unmasked.
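The coverage probability can be explored numerically. The sketch below evaluates the Stevens formula for equal arcs and estimates $P(\ell)$ for unequal arcs by simulation. It is an illustration only: the function names, the number of trials and the seed are assumptions made here, and the simulation is no substitute for the Huffer and Shepp argument.

```python
import random
from math import comb


def stevens_coverage(n, ell):
    """Stevens' formula for n >= 2 arcs of common length ell (unit circumference)."""
    return sum((-1) ** k * comb(n, k) * max(1.0 - k * ell, 0.0) ** (n - 1)
               for k in range(n + 1))


def is_covered(starts, lengths):
    """True if the arcs [s, s + l), taken modulo 1, jointly cover the circle."""
    pieces = []
    for s, l in zip(starts, lengths):
        if l >= 1.0:
            return True
        end = s + l
        if end <= 1.0:
            pieces.append((s, end))
        else:                          # the arc wraps past the point 0
            pieces.append((s, 1.0))
            pieces.append((0.0, end - 1.0))
    pieces.sort()
    reach = 0.0                        # everything in [0, reach] is covered so far
    for a, b in pieces:
        if a > reach:
            return False               # a gap starts at `reach`
        reach = max(reach, b)
    return reach >= 1.0


def simulate_coverage(lengths, trials=20000, seed=0):
    """Monte Carlo estimate of the coverage probability for given arc lengths."""
    rng = random.Random(seed)
    hits = sum(is_covered([rng.random() for _ in lengths], lengths)
               for _ in range(trials))
    return hits / trials


print(stevens_coverage(3, 0.5))               # three equal arcs, total length 1.5
print(simulate_coverage([0.5, 0.5, 0.5]))     # simulation for the same configuration
print(simulate_coverage([0.9, 0.3, 0.3]))     # same total length, more dispersed
```

With the total length held at 1.5, the more dispersed configuration should show the larger estimated coverage probability, in line with Schur convexity.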

4. WAITING FOR A PATTERN 

If we seat a monkey at a keyboard and have him 
type letters, spaces and punctuation marks at ran- 
dom, it is common knowledge that eventually he will 
produce a perfectly typed version of the Gettysburg 
Address and, for that matter, the entire contents of 
the 2004 edition of the Encyclopaedia Britannica. But
we would have to wait a rather long time to see this. 

The mathematical formulation of the monkey's
activities involves observing a sequence $X_1, X_2, \ldots$
of independent identically distributed random variables
with possible values $1, 2, \ldots, k$ and associated
positive probabilities $p_1, p_2, \ldots, p_k$. Let $N$ denote the
waiting time until a particular consecutive string of
outcomes is observed, or one of a particular set of
outcome strings is observed. If we are waiting for the
string $t_1, t_2, \ldots, t_\ell$ where each $t_j$ is a number chosen
from the set $1, 2, \ldots, k$, there are several ways in
which variability can affect the waiting time random
variable $N$. The random variable will be affected by
variability among the $p_j$'s, the probabilities of the
individual possible values of the $X$'s. It will be also
affected by the variability among the $t_j$'s appearing
in the string whose appearance we are awaiting. For
example, we might expect to have to wait longer for
a string of $\ell$ consecutive like outcomes than for a
string of $\ell$ distinct outcomes. Possibilities for a role
for majorization abound here.

In particular, Ross (1999) considers the waiting
time $N$ until we observe a run of $k$ observed values
of the $X_i$'s that includes all $k$ of the possible values
of the $X_i$'s, viewed as a function of $p = (p_1, \ldots, p_k)$. Here
indeed it is possible to verify that for every $n$, $P(N > n)$
is a Schur convex function of $p$, and consequently
that $E(N)$ is also Schur convex as a function of $p$.
The shortest waiting time is thus associated with
the case in which the $p_j$'s are all equal to $1/k$.
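The effect of variability among the $p_j$'s on this waiting time is easy to see in a small simulation. The sketch below is an illustration under assumptions made here (function names, trial count and seed are hypothetical choices); it estimates $E(N)$ rather than reproducing the analysis in Ross (1999).

```python
import random


def waiting_time(p, rng):
    """Number of draws until the last k draws show all k values, where k = len(p)."""
    k = len(p)
    values = list(range(k))
    recent = []                        # sliding window holding the last k draws
    n = 0
    while True:
        n += 1
        recent.append(rng.choices(values, weights=p)[0])
        if len(recent) > k:
            recent.pop(0)
        if len(set(recent)) == k:      # the window contains every possible value
            return n


def mean_waiting_time(p, trials=20000, seed=0):
    """Monte Carlo estimate of E(N) for the probability vector p."""
    rng = random.Random(seed)
    return sum(waiting_time(p, rng) for _ in range(trials)) / trials


print(mean_waiting_time([0.5, 0.5]))   # equal probabilities
print(mean_waiting_time([0.8, 0.2]))   # more variable probabilities
```

For $k = 2$ the mean can be checked by hand: after the first draw one waits a geometric time for the other value, so $E(N) = 1 + p_1/p_2 + p_2/p_1$, which equals 3 at $p = (1/2, 1/2)$ and grows as $p$ becomes more unequal.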

5. PAIRED COMPARISONS 

The theory of paired comparisons has found con- 
siderable application in the study of professional 
sporting contests. At the end of a typical season
each of the k teams in the league will have played
each other team a given number, say p, of times. For 
simplicity, we ignore such factors as home field ad- 
vantage and we assume that the rules of the league 
exclude the possibility of ties. Similar analysis might 
well be applied to taste-testing experiments and other 
paired comparison scenarios, but we will follow Joe 
(1988) and focus on the sports setting. 

In modeling this scenario, it is convenient to consider
a $k \times k$ matrix $P = (p_{ij})$ in which, for $i \ne j$,
$p_{ij}$ denotes the probability that team $i$ will beat
team $j$ in a particular game. Of course we have $p_{ij} + p_{ji} = 1$,
recalling our assumption that ties do not
occur. We leave the diagonal elements of $P$ empty
so that $P$ has $k(k-1)$ nonnegative elements. The
strength of a particular team, say team $i$, is to some
extent measured by its corresponding row total $p_i = \sum_{j \ne i} p_{ij}$.
For a given vector $p$ of team strengths, we
can consider the class $\mathcal{P}(p)$ of all probability matrices
$P$ with only off-diagonal elements defined and
with row totals given by $p$.

It is reasonable to assume that if team $i$ is better
than team $j$ (i.e., if $p_{ij} > 0.5$) and if team $j$ is better
than team $k$, then team $i$ should be better than team
$k$.

Joe calls the matrix $P$ weakly transitive if $p_{ij} \ge 0.5$
and $p_{jk} \ge 0.5$ imply $p_{ik} \ge 0.5$. A stronger condition
is also plausible. He defines $P$ to be strongly
transitive if $p_{ij} \ge 0.5$ and $p_{jk} \ge 0.5$ imply
$p_{ik} \ge \max(p_{ij}, p_{jk})$.

Where does majorization come into this picture?
Each matrix $P$ in $\mathcal{P}(p)$ can be rearranged as a
$k(k-1)$-dimensional row vector denoted by $P^*$.
We will write $P \prec Q$ iff $P^* \prec Q^*$ in the usual sense
of majorization. A matrix $P \in \mathcal{P}(p)$ is said to be
minimal if $Q \prec P$ implies $Q^* = P^*$ up to rearrangement.
Joe (1988) verifies that any strongly transitive
$P$ is minimal. Variations in which ties and home
field advantage are considered are also discussed in
Joe (1988).
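The two transitivity conditions can be checked mechanically for any given matrix. The sketch below is an illustration only: it reads the inequalities in the definitions as non-strict ($\ge$), and the helper names and the three example matrices are hypothetical choices made here, not taken from Joe (1988).

```python
from itertools import permutations


def make_matrix(k, upper):
    """Build a k x k paired-comparison matrix from upper[(i, j)] = p_ij, i < j."""
    P = [[None] * k for _ in range(k)]       # the diagonal is left empty
    for (i, j), pij in upper.items():
        P[i][j] = pij
        P[j][i] = 1.0 - pij                  # no ties: p_ij + p_ji = 1
    return P


def weakly_transitive(P):
    """p_ij >= 0.5 and p_jm >= 0.5 imply p_im >= 0.5 for all distinct i, j, m."""
    return all(P[i][m] >= 0.5
               for i, j, m in permutations(range(len(P)), 3)
               if P[i][j] >= 0.5 and P[j][m] >= 0.5)


def strongly_transitive(P):
    """p_ij >= 0.5 and p_jm >= 0.5 imply p_im >= max(p_ij, p_jm)."""
    return all(P[i][m] >= max(P[i][j], P[j][m])
               for i, j, m in permutations(range(len(P)), 3)
               if P[i][j] >= 0.5 and P[j][m] >= 0.5)


ordered = make_matrix(3, {(0, 1): 0.6, (1, 2): 0.65, (0, 2): 0.75})
weak_only = make_matrix(3, {(0, 1): 0.9, (1, 2): 0.9, (0, 2): 0.6})
cyclic = make_matrix(3, {(0, 1): 0.6, (1, 2): 0.6, (0, 2): 0.4})

for name, P in (("ordered", ordered), ("weak_only", weak_only), ("cyclic", cyclic)):
    print(name, weakly_transitive(P), strongly_transitive(P))
```

The cyclic example (team 0 tends to beat 1, 1 tends to beat 2, 2 tends to beat 0) fails even weak transitivity, while the second example shows that the strong condition is genuinely more demanding than the weak one.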

6. PHASE TYPE DISTRIBUTIONS 

In a continuous-time Markov chain with $(n + 1)$
states, of which $n$ states $(1, 2, \ldots, n)$ are transient
and state $n + 1$ is absorbing, the time $T$ until absorption
in state $n + 1$ is said to have a phase type
distribution (Neuts, 1975). Such distributions are
parameterized by an initial distribution vector for
the chain, $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_n)$ (we assume that the
probability of beginning in state $n + 1$ is 0), and
a matrix of intensities of transitions among the $n$
transient states $Q$. The elements of $Q$ satisfy $q_{ii} < 0$,
$i = 1, 2, \ldots, n$, and $q_{ij} \ge 0$, $j \ne i$. In such a setting
$T$ is said to have a phase type distribution with
parameters $\alpha$ and $Q$ and we write $T \sim PH(\alpha, Q)$.
A very simple example is the one in which $\alpha = \alpha^* = (1, 0, 0, 0, \ldots, 0)$
and $Q = Q^*$ where $q^*_{ii} = -\delta$ for all $i$
and $q^*_{ij} = \delta$ for $j = i + 1$ while $q^*_{ij} = 0$ otherwise. In
this situation the chain begins in state 1, and then
successively moves through states $2, 3, \ldots, n$, spending
an exponential($\delta$) time in each state. Consequently
the time to absorption, say $T^*$, will be a
sum of $n$ i.i.d. exponential random variables and so
$T^* \sim \text{gamma}(n, \delta)$ (in queueing contexts this is often
called the Erlang distribution rather than the
gamma distribution).

We say that a phase type distribution is of order n 
if n is the smallest integer such that the distribution 
can be identified with the absorption time of a chain 
with n transient states and one absorbing state. It 
appears that, in some sense, T* exhibits the most 
regular behavior of any phase type distribution of 
order n. This can be made precise in terms of what 
is called the Lorenz order, a natural extension of 
majorization. 

Let $\mathcal{L}$ denote the class of nonnegative random
variables with finite positive expectations. (This can
be extended to allow the random variables to assume
negative values, but for our present purposes this is
not needed.) For $X$ and $Y$ in $\mathcal{L}$, we will write $X \le_L Y$
iff $E(g(X/E(X))) \le E(g(Y/E(Y)))$ for every continuous
convex function $g$. Majorization can be identified
as a special case here by choosing $X$ and $Y$ to
each have $n$ equally likely values $x_1, x_2, \ldots, x_n$ and
$y_1, y_2, \ldots, y_n$, respectively, with $E(X) = E(Y)$. More
detailed discussion of the Lorenz order on $\mathcal{L}$ may be
found in Arnold (1987). Aldous and Shepp (1987)
showed that $T^*$ [with its gamma($n, \delta$) distribution]
has the smallest coefficient of variation among phase
type distributions of order $n$, that is, it minimizes
$E\left(\left(\frac{T - E(T)}{E(T)}\right)^2\right)$. More generally, O'Cinneide (1991) verified
that $T^* \le_L T$ for any variable $T$ that is phase
type of order $n$, thus confirming the fact that $T^*$
exhibits the least "variability" (as measured by the
Lorenz order).
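The coefficient-of-variation statement can be illustrated within a small subfamily of phase type distributions: chains that pass through states $1, 2, \ldots, n$ in order, with possibly different exponential rates. For such a chain $T$ is a sum of independent exponentials, so its mean and variance are sums of $1/\lambda_i$ and $1/\lambda_i^2$. The sketch below covers only this subfamily, which is an assumption made here for simplicity; it does not reproduce the general Aldous and Shepp result, and the function name is hypothetical.

```python
def squared_cv(rates):
    """Squared coefficient of variation of a sum of independent exponential times.

    `rates` holds the exponential rate of each successive state. Only chains
    that visit the states strictly in order are covered (an assumption here).
    """
    mean = sum(1.0 / r for r in rates)
    variance = sum(1.0 / r ** 2 for r in rates)
    return variance / mean ** 2


n = 4
print(squared_cv([2.0] * n))              # equal rates: the gamma (Erlang) case
print(squared_cv([1.0, 2.0, 4.0, 8.0]))   # unequal rates give a larger value
```

In the equal-rate case the squared coefficient of variation is $1/n$ whatever the common rate, and any inequality among the rates pushes it above $1/n$.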

7. CATCHABILITY 

An island community contains an unknown number
$\nu$ of species of butterflies. Butterflies are sequentially
trapped until $n$ individuals have been captured.
Denote by $r$ the number of distinct species
represented among the captured butterflies. We may
well use $r$ (and $n$) to help us estimate $\nu$.

A typical stochastic model for this problem is based
on the assumption that butterflies from species $j$,
$j = 1, 2, \ldots, \nu$, enter the trap according to a Poisson($\lambda_j$)
process and that these Poisson processes are independent.
Define $p_j = \lambda_j / \sum_{i=1}^{\nu} \lambda_i$. The probability
that a particular butterfly trapped is from species
$j$ is then given by $p_j$, $j = 1, 2, \ldots, \nu$. The $p_j$'s can
be interpreted as measures of "catchability" of the
various species. The simplest model is that of equal
catchability (i.e., $p_j = 1/\nu$, $j = 1, 2, \ldots, \nu$). If we assume
that $\nu \le n$, then, under the equal catchability
model, a minimum variance unbiased estimate of $\nu$,
based on $r$, exists. It is given by

$$\hat{\nu} = S(n + 1, r)/S(n, r), \tag{7.1}$$

where $S(n, x)$ is a Stirling number of the second
kind. What happens when the species vary in catchability?
In an extreme case in which one particular
species is easily trapped and the others are
extremely difficult to trap, we will usually observe
$r = 1$ and will consequently badly underestimate $\nu$.
Indeed as Nayak and Christman (1992) observe, the
random number $R$ of species captured has a distribution
which is a Schur convex function of $p$. Thus
the estimate (7.1) and other estimates which are
sensible under equal catchability will be negatively
biased with the bias increasing as the catchability
becomes more variable.
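The estimate (7.1) is straightforward to compute from the standard recurrence for Stirling numbers of the second kind. The sketch below also evaluates the expected number of distinct species, $E(R) = \sum_j [1 - (1 - p_j)^n]$, under the assumption that each of the $n$ captured individuals independently belongs to species $j$ with probability $p_j$. The function names are hypothetical, and the code is an illustration rather than the procedure of Nayak and Christman (1992).

```python
from fractions import Fraction
from functools import lru_cache


@lru_cache(maxsize=None)
def stirling2(n, r):
    """Stirling number of the second kind via S(n, r) = r S(n-1, r) + S(n-1, r-1)."""
    if n == r:
        return 1
    if r == 0 or r > n:
        return 0
    return r * stirling2(n - 1, r) + stirling2(n - 1, r - 1)


def species_estimate(n, r):
    """The estimate (7.1), returned as an exact fraction; requires 1 <= r <= n."""
    return Fraction(stirling2(n + 1, r), stirling2(n, r))


def expected_distinct(p, n):
    """E(R) when each of n captures is from species j with probability p[j]."""
    return sum(1 - (1 - pj) ** n for pj in p)


print(species_estimate(3, 2))
print(expected_distinct([0.25] * 4, 5))             # equal catchability
print(expected_distinct([0.7, 0.1, 0.1, 0.1], 5))   # unequal catchability
```

Unequal catchability lowers the expected number of distinct species seen, which is one way to see why estimates calibrated for equal catchability tend to fall short.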

8. DISEASE TRANSMISSION 

Tong (1997) identifies an interesting majorization 
feature of a disease transmission model due to 
Eisenberg (1991). Consider a closed population of 
n + 1 individuals. One individual (number n + 1) is 
susceptible to the disease but as yet is uninfected. 
The other n individuals are carriers of the disease. 
If individual $n+1$ has a single contact with individual
$i$, we denote the probability of avoiding infection
by $p_i$, $i = 1, 2, \ldots, n$.

It is assumed that individual $n + 1$ makes a total of
$J$ contacts with individuals in the population in accordance
with a preference vector $\alpha = (\alpha_1, \alpha_2, \ldots, \alpha_n)$,
where $\alpha_i > 0$, $i = 1, 2, \ldots, n$, and $\sum_{i=1}^{n} \alpha_i = 1$.
In addition, individual $n+1$ has a lifestyle vector
$k = (k_1, k_2, \ldots, k_J)$ where the $k_i$'s are nonnegative
integers summing to $J$. For given vectors $\alpha$ and $k$,
the individual $n + 1$ proceeds as follows. He/she first
picks a partner from among the $n$ carriers according
to the preference vector $\alpha$. Thus he/she will select
individual 1 with probability $\alpha_1$, individual 2 with
probability $\alpha_2$, and so on. He/she then makes $k_1$
contacts with this partner. Then he/she selects a
second partner (it could be the same one) according
to the preference vector $\alpha$ and has $k_2$ contacts
with this partner. The process terminates after all
$J = \sum_{i=1}^{J} k_i$ contacts have been made. Denote the
probability of escaping infection by $H(k, \alpha, p)$,
depending as it does on lifestyle ($k$), preference ($\alpha$)
and variable nontransmission probabilities ($p$).

There are several possible roles for majorization
here. Variability among the coordinates of $k$, $\alpha$
and/or $p$ can be expected to affect $H(k, \alpha, p)$. Tong
(1997) focuses on the lifestyle vector $k$. Two extreme
lifestyles are readily identified. The first one corresponds
to $k = (J, 0, 0, \ldots, 0)$ which could be called
a monogamous style. Here a partner is randomly
chosen according to the preference vector $\alpha$ and all
contacts are made with this individual. The second
extreme lifestyle has $k = (1, 1, 1, \ldots, 1)$. In this
case each contact is made with a randomly chosen
individual. The probability of escaping infection
with $k = (J, 0, \ldots, 0)$ is clearly $\sum_{i=1}^{n} \alpha_i p_i^J$ while the
probability of escaping infection using the lifestyle
$(1, 1, 1, \ldots, 1)$ is $(\sum_{i=1}^{n} \alpha_i p_i)^J$. It follows via Jensen's
inequality that one has a larger probability of escaping
infection with the "monogamous" lifestyle
$(J, 0, \ldots, 0)$ than with the "random" lifestyle
$(1, 1, 1, \ldots, 1)$. This holds for every $\alpha$ and every $p$. But of
course these two lifestyles are extreme cases with regard
to majorization. It is then quite plausible that
the probability of escaping infection is a Schur convex
function of the lifestyle vector $k$. Indeed, Tong
(1997) confirms this conjecture. He also is able to
get some results when the number $J$ of contacts is a
random variable. Several interesting aspects of this
problem remain open.
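The two extreme cases suggest a product form for the escape probability. If partner selections are independent and each of the $k_j$ contacts with partner $i$ is escaped independently with probability $p_i$, then $H(k, \alpha, p) = \prod_j \sum_i \alpha_i p_i^{k_j}$, which reduces to the two expressions above for the monogamous and random lifestyles. The sketch below computes this product; the independence assumptions, the function name and the numerical values are assumptions made here for illustration.

```python
from math import prod


def escape_probability(k, alpha, p):
    """H(k, alpha, p) under the independence assumptions stated in the lead-in.

    k     : lifestyle vector (contacts made with each successive partner)
    alpha : preference vector (probability of choosing each carrier)
    p     : per-contact probabilities of avoiding infection
    """
    return prod(sum(a * pi ** kj for a, pi in zip(alpha, p)) for kj in k)


alpha = (0.5, 0.3, 0.2)      # hypothetical preference vector
p = (0.9, 0.5, 0.2)          # hypothetical nontransmission probabilities

for lifestyle in ((3, 0, 0), (2, 1, 0), (1, 1, 1)):
    print(lifestyle, escape_probability(lifestyle, alpha, p))
```

The three lifestyles are listed from most to least dispersed in the majorization order, and the computed escape probabilities decrease accordingly.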

9. APPORTIONMENT IN PROPORTIONAL 
REPRESENTATION 

The ideal of one man-one vote is often approached 
by the device of proportional representation. Thus 
if there are N seats available and if a political party 
received 100q% of the votes, then ideally that party
should be assigned Nq seats. But fractional seats 
cannot be assigned (or better yet are not assigned, 
since there seems to be no reason why they could 
not be assigned, except perhaps for aesthetic considerations).
Which method of rounding should be
used to arrive at an assignment of integer-valued
numbers of seats to every party in a manner essen- 
tially reflecting proportional representation? This is 
not a new problem. Several very well-known Amer- 
ican politicians have proposed methods of round- 
ing for use in this situation. Balinski and Young 
(2001) provide a good survey of the methods usually 
considered. Marshall, Olkin and Pukelsheim (2002) 
highlight the role of majorization in comparing the 
various candidate rounding methods. John Quincy 
Adams proposed a method that was kind to small 
parties (rounding up their representation), while at 
the other extreme Thomas Jefferson urged rounding 
down, which favors large parties. Other popular in- 
termediate strategies are associated with the names 
Dean, Hill and Webster. 

It is easiest to describe all of these apportionment
methods in terms of a sequence of signposts
which determine rounding decisions. The signposts
$s(k)$ are numbers in the interval $[k, k + 1]$ such that
$s(k)$ is a strictly increasing function of $k$. The corresponding
rounding rule is that a number in the
interval $[k, k + 1]$ is rounded down if it is less than
$s(k)$ and is rounded up if it is greater than $s(k)$. If
the number is exactly equal to $s(k)$, then we may
round up or down. So-called power-mean signpost
sequences have been popular. They are of the form

$$s_p(k) = \left(\frac{k^p + (k + 1)^p}{2}\right)^{1/p}, \qquad -\infty \le p \le \infty.$$

The five most popular apportionment methods can
all be viewed as having been based on a particular
power-mean signpost sequence. The Adams rule
(rounding up) corresponds to $p = -\infty$, the Dean
rule corresponds to $p = -1$, the Hill rule corresponds
to $p = 0$, the Webster rule to $p = 1$ and finally the
Jefferson rule (rounding down) corresponds to $p = \infty$.
Marshall, Olkin and Pukelsheim (2002) show
that the seating vector produced by a power-mean
rounding rule of order $p$ will always be majorized
by the seating vector produced by a power-mean
rounding rule of order $p'$ if and only if $p \le p'$. Consequently,
among the five popular apportionment
rules, the change when moving from the Adams rule
toward the Jefferson rule is a change in favor of
large parties in a majorization sense. The move from
an Adams apportionment toward a Jefferson apportionment
can actually be accomplished by a series of
single seat reassignments from a poorer party (with
fewer votes) to a richer party (with more votes) [paralleling
reverse Robin Hood (a.k.a. Pigou-Dalton)
income transfers in an economic setting].
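The five rules can be compared on a small example. The sketch below implements the power-mean signposts and allocates seats one at a time to the party with the largest ratio of votes to its next signpost, which is the usual highest-averages way of carrying out a divisor method. That sequential formulation, the function names and the vote counts are assumptions made here for illustration; the limits $p = 0$ and $p = \pm\infty$ are handled as special cases.

```python
import math


def signpost(k, p):
    """Power-mean signpost s(k) in [k, k + 1] of order p."""
    if p == -math.inf:
        return float(k)                   # Adams: always round up
    if p == math.inf:
        return float(k + 1)               # Jefferson: always round down
    if p == 0:
        return math.sqrt(k * (k + 1))     # Hill: geometric mean
    if k == 0 and p < 0:
        return 0.0                        # limiting value, avoids 0 ** negative
    return ((k ** p + (k + 1) ** p) / 2) ** (1 / p)


def apportion(votes, seats, p):
    """Seat vector for the divisor method with signposts of order p."""
    alloc = [0] * len(votes)
    for _ in range(seats):
        def priority(i):
            s = signpost(alloc[i], p)
            return math.inf if s == 0 else votes[i] / s
        winner = max(range(len(votes)), key=priority)
        alloc[winner] += 1
    return alloc


votes = [53, 30, 17]                      # hypothetical vote counts
for name, p in (("Adams", -math.inf), ("Dean", -1), ("Hill", 0),
                ("Webster", 1), ("Jefferson", math.inf)):
    print(name, apportion(votes, 10, p))
```

On this example the Jefferson seating vector gives the largest party one more seat than the Adams vector does, at the expense of the smallest party, so the Jefferson vector majorizes the Adams vector.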

10. MAJORIZATION IN STATISTICAL 
MECHANICS 

The state space of a physical system, $S_n$, can be
identified with the set of all probability vectors
$p = (p_1, p_2, \ldots, p_n)'$ where $p_i \ge 0$ and $\sum_{i=1}^{n} p_i = 1$. A useful
partial order in this context is related to the
information content of the states. For two states
$p$ and $q$ it is prescribed that $p \prec q$ iff there exists
a doubly stochastic matrix $T$ with $p = Tq$. But of
course, appealing to the classical result of Hardy,
Littlewood and Polya (1929), this is in fact the majorization
partial order (and the notation is thus
consistent with our usage in earlier sections of this
paper). In this context separable concave functions
are called generalized entropies.
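A small numerical check makes the doubly stochastic characterization concrete. The sketch below mixes a state $q$ with a doubly stochastic matrix $T$, confirms that $p = Tq$ is majorized by $q$ via partial sums of the largest coordinates, and compares Shannon entropies (one example of a separable concave function). The matrix, the state and the helper names are hypothetical choices made here for illustration.

```python
from math import log


def apply_matrix(T, q):
    """Matrix-vector product p = T q for a square matrix given as nested lists."""
    return [sum(T[i][j] * q[j] for j in range(len(q))) for i in range(len(T))]


def is_doubly_stochastic(T, tol=1e-12):
    """Nonnegative entries with every row sum and every column sum equal to 1."""
    n = len(T)
    rows_ok = all(abs(sum(row) - 1.0) <= tol for row in T)
    cols_ok = all(abs(sum(T[i][j] for i in range(n)) - 1.0) <= tol for j in range(n))
    return rows_ok and cols_ok and all(t >= 0 for row in T for t in row)


def is_majorized_by(p, q, tol=1e-12):
    """True if p is majorized by q, using partial sums of the largest coordinates."""
    ps, qs = sorted(p, reverse=True), sorted(q, reverse=True)
    sp = sq = 0.0
    for a, b in zip(ps, qs):
        sp += a
        sq += b
        if sp > sq + tol:
            return False
    return abs(sp - sq) <= tol             # the totals must agree


def entropy(p):
    """Shannon entropy, a separable concave function of the state."""
    return -sum(pi * log(pi) for pi in p if pi > 0)


T = [[0.5, 0.0, 0.5],
     [0.5, 0.5, 0.0],
     [0.0, 0.5, 0.5]]                      # an equal mixture of two permutations
q = [0.7, 0.2, 0.1]
p = apply_matrix(T, q)

print(is_doubly_stochastic(T), is_majorized_by(p, q))
print(entropy(q), entropy(p))
```

The mixed state has the larger entropy, as expected for a state that is lower in the majorization order.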

A related partial order is defined on $k$-tuples of
states. For two $k$-tuples $(p_1, p_2, \ldots, p_k)$ and $(q_1, q_2, \ldots, q_k)$
we define

$$(p_1, p_2, \ldots, p_k) \prec (q_1, q_2, \ldots, q_k)$$

iff there exists a stochastic matrix $T$ such that $p_i = Tq_i$,
$i = 1, 2, \ldots, k$. In particular when $k = 2$, a partial
ordering defined with respect to a reference
state $s$ becomes of interest. The partial order relative
to $s$ is defined by

$$p \prec_s q \quad \text{iff} \quad (p, s) \prec (q, s). \tag{10.1}$$

It may be noted that if $s$ is chosen to be equal to
$e = (1/n, \ldots, 1/n)$, then the corresponding partial order
(relative to $e$) coincides with the usual majorization
order. Thus the partial ordering $\prec_s$ is a genuine
extension of the classical majorization order.

Dynamic processes in the state space $S_n$ can be
identified with indexed families of stochastic matri- 
ces. Such processes which preserve the s-partial or- 
der have been studied in some detail. A convenient 
introductory reference is Zylka (1985). 

Schur convex functions and analogous s-Schur con- 
vex functions turn out to have useful thermody- 
namic interpretation in this context. 

11. CONNECTED COMPONENTS 
IN A RANDOM GRAPH 

Ross (1981) considers a random graph with nodes
numbered $1, 2, \ldots, n$. Suppose that $X(1), X(2), \ldots, X(n)$
are independent identically distributed random
variables each with possible values $1, 2, \ldots, n$
and with common distribution defined by

$$P(X(i) = j) = p_j, \qquad j = 1, 2, \ldots, n, \tag{11.1}$$

where $p_j \ge 0$ for all $j$ and $\sum_{j=1}^{n} p_j = 1$. We construct the
random graph by drawing the $n$ random arcs $(i, X(i))$,
$i = 1, 2, \ldots, n$. In this manner, one arc emanates from
each node. However, of course, several arcs can terminate
at the same node. The resulting graph will
have a random number of connected components. A
connected component of the graph is a set of nodes
such that any pair of them is linked by a path of arcs in the
graph, and there are no arcs joining any nodes in
the set with any node outside the set. Let us denote
the random number of such connected subsets
by $M$. The distribution of $M$ will of course
be influenced by the probability vector $p$, appearing
in (11.1), which governs the distribution of the
random arcs $X(1), X(2), \ldots, X(n)$.

For example, if p = (1, 0, 0, . . . , 0), then all arcs will 
terminate at node 1 and there will be a single con- 
nected subset of nodes in the random graph, that is, 
M = 1. 

The following expression for the expected value of
$M$ is provided by Ross:

$$E(M) = \sum_{S} (|S| - 1)! \prod_{j \in S} p_j, \tag{11.2}$$

where the summation extends over all nonempty
subsets $S$ of $\{1, 2, \ldots, n\}$. It is then possible, using this
expression, to verify that $E(M)$ is a Schur concave
function of $p$. Consequently the expected number of
connected components of the graph is maximized if
$p_j = 1/n$, $j = 1, 2, \ldots, n$.
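For small $n$ the expression (11.2) can be evaluated exactly by enumerating subsets. The sketch below does this with exact rational arithmetic; the function name is a hypothetical choice, and the enumeration is exponential in $n$, so it is meant only for illustration.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial, prod


def expected_components(p):
    """E(M) from (11.2): sum over nonempty subsets S of (|S| - 1)! * prod_{j in S} p_j."""
    n = len(p)
    total = Fraction(0)
    for size in range(1, n + 1):
        for S in combinations(range(n), size):
            total += factorial(size - 1) * prod(Fraction(p[j]) for j in S)
    return total


uniform = [Fraction(1, 3)] * 3
skewed = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]

print(expected_components([1, 0, 0]))    # all arcs end at node 1
print(expected_components(uniform))
print(expected_components(skewed))
```

The degenerate vector gives a single component, and among the two nondegenerate vectors the uniform one gives the larger expected count, consistent with Schur concavity.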

12. A STOCHASTIC RELATION BETWEEN 
THE SUM OF TWO RANDOM VARIABLES 
AND THEIR MAXIMUM 

Suppose that $X = (X_1, X_2)$ is a random vector
with nonnegative coordinate random variables $X_1, X_2$.
It is often of interest to compare the tail behavior
of $X_1 + X_2$ with that of $\max(X_1, X_2)$. In the context
of construction of confidence intervals for the difference
between normal means with unequal variances
(a Behrens-Fisher setting), Dalal and Fortini (1982)
identified a sufficient condition for stochastic ordering
between $X_1 + X_2$ and $\sqrt{2}\max(X_1, X_2)$ that
involves Schur convexity. Specifically they prove that
a sufficient condition for

$$P(X_1 + X_2 \le c) \ge P(\sqrt{2}\max(X_1, X_2) \le c)$$

for any $c > 0$, is that the joint density of $(X_1, X_2)$,
say $f(x_1, x_2)$, is such that $f(\sqrt{x_1}, \sqrt{x_2})$ is a Schur
convex function of $x$. The proof involves conditioning
on $X_1^2 + X_2^2$ and observing that on any curve
$x_1^2 + x_2^2 = t$, the joint density $f(x_1, x_2)$ increases as
one moves away from the line $x_1 = x_2$.

An important special case in which the hypothe- 
ses are satisfied is the situation in which {Xi,X2) = 
m\,\Y2\) where i:~iV(2) (0^^2(1 

A related n-dimensional result is also provided by
Dalal and Fortini (1982). They show that if $X_1, X_2, \ldots, X_n$
are i.i.d. positive random variables with common
density $f$ and if $\log f(\sqrt{x})$ is concave and $f(x)/x$
is nonincreasing, then

$$\sum_{i=1}^{n} X_i \le_{st} \sqrt{n}\max(X_1, X_2, \ldots, X_n).$$

13. FURTHER EXAMPLES 

The list could be continued. Schur convexity and 
majorization can be found in many other settings. 
To conclude our short survey, we will merely men- 
tion briefly a few more interesting settings in which 
Waldo appears: 

(i) the study of peakedness of univariate and 
multivariate distributions, 

(ii) admissibility of tests in multivariate analysis 
of variance, 

(iii) probability content of regions for a Schur con- 
cave joint density, 

(iv) the study of diversity in ecological environ- 
ments, 

(v) income and wealth inequality measurement 
(with multivariate extensions). 

As observed in the Introduction, there are many 
more examples in the literature and there is no rea- 
son to believe that the search for new applications of 
majorization and Schur convexity will falter in the 
next 25 years. When the Inequalities volume cele- 
brates its golden jubilee, an even more extensive and 
fascinating array of appearances can be confidently 
predicted. The search for Waldo will continue apace. 